Introduction
What is the bugRzilla Package?
bugRzilla is an R package that helps users interact with Bugzilla through its API.
To learn more, see bugRzilla.
About the bugRzilla Google Summer of Code Project
bugRzilla is a package for interacting with a Bugzilla API, especially R Bugzilla. The goal of the project is to help users submit issues to R Bugzilla.
About This Project
This project explores the issues and bugs on R Bugzilla to improve submissions made through bugRzilla. It might also help identify useful patterns for R core or report on the status of R Bugzilla.
To learn more, see bugzilla_viz.
Setup Database on your local system
Download SQL and MySQL Workbench
To install MySQL on Ubuntu, refer to the blog post by DigitalOcean. To install MySQL Workbench on Ubuntu, refer to the blog post by LinuxHint.
Download R_bugzilla data
- The R_bugzilla data can be downloaded from link.
- The downloaded data is a zip file, so unzip it with the Extract Here option into the folder of your choice before loading it. The extracted file has the extension .sql (e.g. R-bugs.sql).
Load the downloaded R_bugzilla data into MySQL
Before importing the R_bugzilla SQL file, create the (empty) database in MySQL if it doesn't already exist and the exported SQL doesn't contain CREATE DATABASE (that is, it was exported with the --no-create-db or -n option).
Then open your terminal and run the command: mysql -u my_username -p database_name < input_file_path or, from inside the mysql shell, use the command: source <Path>/R-bugs.sql;
- The -u flag indicates that the MySQL username will follow.
- The -p flag indicates that we should be prompted for the password associated with that username.
- database_name is the exact name of the database to import into, e.g. bugRzilla, the empty database you created.
- The < symbol is a Unix directive for STDIN, which feeds the contents of a file to the issued command. In this case, that input is the SQL file, specified by input_file_path.
- At the command prompt, run the following command to launch the mysql shell and enter it as the root user:
mysql -u root -p
- When you're prompted for a password, enter the one that you set at installation time, or if you haven't set one, press Enter to submit no password. The following mysql shell prompt should appear:
mysql>
- In MySQL, I used the following steps to load the data into the empty database:
  - Create an empty database:
  create database bugRzilla;
  - To check whether the database was created, use:
  show databases;
  - Once the empty database exists, load the SQL data into it. If the dump does not select a database itself, select the new one first with use bugRzilla; and then run:
  source /home/data/Documents/GSOC/R-bugs.sql;
  - To check that the data was loaded correctly, use:
  show tables;
mysql> show tables;
+---------------------+
| Tables_in_bugRzilla |
+---------------------+
| attachments         |
| bugs                |
| bugs_activity       |
| bugs_fulltext       |
| bugs_mod            |
| components          |
| longdescs           |
+---------------------+
7 rows in set (0.00 sec)
bugRzilla Analysis
To connect to the database, I'm using the dplyr package, which supports widely used open-source databases such as MySQL.
The libraries used for the analysis:
# loading packages
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(dbplyr)
##
## Attaching package: 'dbplyr'
## The following objects are masked from 'package:dplyr':
##
## ident, sql
library(RMySQL)
## Loading required package: DBI
library(DBI)
library(DT)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.2 ✓ stringr 1.4.0
## ✓ tidyr 1.1.3 ✓ forcats 0.5.1
## ✓ readr 1.4.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dbplyr::ident() masks dplyr::ident()
## x dplyr::lag() masks stats::lag()
## x dbplyr::sql() masks dplyr::sql()
library(skimr)
library(ggplot2)
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
Connect bugRzilla SQL Database with R
# Connecting R with MySQL
con <- dbConnect(
  MySQL(),
  dbname = 'bugRzilla',  # change the database name to your database name
  username = 'root',     # change the username to your username
  password = '1204',     # update your password
  host = 'localhost',
  port = 3306)
# Accessing Tables names from the Database
DBI::dbListTables(con)
## [1] "attachments" "bugs" "bugs_activity" "bugs_fulltext"
## [5] "bugs_mod" "components" "longdescs"
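The connection above writes the username and password directly into the script. As a minimal sketch of an alternative, the settings could be read from environment variables instead. The variable names (`BUGRZILLA_DB_NAME` and so on) and the helper `read_db_settings()` are hypothetical and not part of bugRzilla or RMySQL; the defaults simply mirror the values used above.

```r
# Hypothetical helper: collect connection settings from environment variables.
# The variable names are assumptions for illustration; choose your own.
read_db_settings <- function() {
  list(
    dbname   = Sys.getenv("BUGRZILLA_DB_NAME", unset = "bugRzilla"),
    username = Sys.getenv("BUGRZILLA_DB_USER", unset = "root"),
    password = Sys.getenv("BUGRZILLA_DB_PASSWORD", unset = ""),
    host     = Sys.getenv("BUGRZILLA_DB_HOST", unset = "localhost"),
    port     = as.integer(Sys.getenv("BUGRZILLA_DB_PORT", unset = "3306"))
  )
}

settings <- read_db_settings()

# The settings could then be handed to the same connection call, e.g.
# con <- dbConnect(MySQL(), dbname = settings$dbname, username = settings$username,
#                  password = settings$password, host = settings$host,
#                  port = settings$port)
```

This keeps the password out of the rendered report; the values can be set in the shell or in a local `.Renviron` file before running the analysis.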
Data Exploration of Bugs Table from the Database
bugs_df <- tbl(con, "bugs")
## Warning in .local(conn, statement, ...): Decimal MySQL column 24 imported as
## numeric
## Warning in .local(conn, statement, ...): Decimal MySQL column 25 imported as
## numeric
#for quick view of the datatypes and the structure of data
skim(bugs_df)
## Warning in .local(conn, statement, ...): Decimal MySQL column 24 imported as
## numeric
## Warning in .local(conn, statement, ...): Decimal MySQL column 25 imported as
## numeric
| Name | bugs_df |
| Number of rows | 7042 |
| Number of columns | 27 |
| _______________________ | |
| Column type frequency: | |
| character | 15 |
| numeric | 12 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| bug_file_loc | 0 | 1 | 0 | 136 | 6990 | 51 | 0 |
| bug_severity | 0 | 1 | 5 | 11 | 0 | 7 | 0 |
| bug_status | 0 | 1 | 3 | 11 | 0 | 8 | 0 |
| creation_ts | 0 | 1 | 19 | 19 | 0 | 7028 | 0 |
| delta_ts | 0 | 1 | 19 | 19 | 0 | 6308 | 0 |
| short_desc | 0 | 1 | 1 | 255 | 0 | 6923 | 0 |
| op_sys | 0 | 1 | 3 | 15 | 0 | 22 | 0 |
| priority | 0 | 1 | 2 | 2 | 0 | 5 | 0 |
| rep_platform | 0 | 1 | 3 | 25 | 0 | 7 | 0 |
| version | 0 | 1 | 3 | 15 | 0 | 43 | 0 |
| resolution | 0 | 1 | 0 | 19 | 564 | 12 | 0 |
| target_milestone | 0 | 1 | 3 | 3 | 0 | 1 | 0 |
| status_whiteboard | 0 | 1 | 0 | 0 | 7042 | 1 | 0 |
| lastdiffed | 0 | 1 | 19 | 19 | 0 | 6324 | 0 |
| deadline | 7008 | 0 | 19 | 19 | 0 | 30 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| bug_id | 0 | 1 | 10817.89 | 6189.36 | 1 | 5686.75 | 14101.5 | 16048.75 | 18097 | ▃▁▂▂▇ |
| assigned_to | 0 | 1 | 17.48 | 120.26 | 1 | 2.00 | 5.0 | 16.00 | 2787 | ▇▁▁▁▁ |
| product_id | 0 | 1 | 2.00 | 0.00 | 2 | 2.00 | 2.0 | 2.00 | 2 | ▁▁▇▁▁ |
| reporter | 0 | 1 | 685.69 | 1003.34 | 1 | 2.00 | 2.0 | 1056.00 | 3432 | ▇▂▁▁▁ |
| component_id | 0 | 1 | 9.84 | 5.20 | 2 | 6.00 | 9.0 | 15.00 | 19 | ▇▇▆▃▆ |
| qa_contact | 7042 | 0 | NaN | NA | NA | NA | NA | NA | NA | |
| votes | 0 | 1 | 0.00 | 0.00 | 0 | 0.00 | 0.0 | 0.00 | 0 | ▁▁▇▁▁ |
| everconfirmed | 0 | 1 | 0.83 | 0.38 | 0 | 1.00 | 1.0 | 1.00 | 1 | ▂▁▁▁▇ |
| reporter_accessible | 0 | 1 | 1.00 | 0.00 | 1 | 1.00 | 1.0 | 1.00 | 1 | ▁▁▇▁▁ |
| cclist_accessible | 0 | 1 | 1.00 | 0.00 | 1 | 1.00 | 1.0 | 1.00 | 1 | ▁▁▇▁▁ |
| estimated_time | 0 | 1 | 0.10 | 6.60 | 0 | 0.00 | 0.0 | 0.00 | 552 | ▇▁▁▁▁ |
| remaining_time | 0 | 1 | 0.00 | 0.00 | 0 | 0.00 | 0.0 | 0.00 | 0 | ▁▁▇▁▁ |
The following columns hold date or time information but are stored as character or numeric values:
- creation_ts
- delta_ts
- lastdiffed
- estimated_time
- remaining_time
- deadline
estimated_time and remaining_time contain only integer values, so they can't be converted to the Date datatype. There are also columns that are empty or hold a single constant value, so they are of no use for the analysis:
- target_milestone
- qa_contact
- status_whiteboard
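Columns like the ones listed above can also be found programmatically rather than by reading the skim output. The following is a minimal sketch on a made-up toy data frame (the values, including the `"---"` placeholder, are assumptions and not taken from the R Bugzilla dump); `is_uninformative()` is a hypothetical helper name.

```r
# Toy data frame standing in for the bugs table (values are made up).
toy_bugs <- data.frame(
  bug_id            = c(1, 2, 3),
  target_milestone  = c("---", "---", "---"),
  qa_contact        = c(NA, NA, NA),
  status_whiteboard = c("", "", ""),
  priority          = c("P5", "P3", "P5"),
  stringsAsFactors  = FALSE
)

# A column carries no information when, ignoring NA, it has at most
# one distinct value (all missing, all empty, or one constant).
is_uninformative <- function(column) {
  length(unique(column[!is.na(column)])) <= 1
}

uninformative <- names(toy_bugs)[vapply(toy_bugs, is_uninformative, logical(1))]
print(uninformative)
```

On the toy data this flags `target_milestone`, `qa_contact` and `status_whiteboard`, while `bug_id` and `priority` are kept.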
# Converting `bugs_df` to `dataframe`
bugs_df <- as.data.frame(bugs_df)
## Warning in .local(conn, statement, ...): Decimal MySQL column 24 imported as
## numeric
## Warning in .local(conn, statement, ...): Decimal MySQL column 25 imported as
## numeric
Cleaning the data
As a first step, check the data and prepare it for the analysis:
# converting the required fields to the correct datatype
bugs_df <- bugs_df %>%
  mutate_at(vars("creation_ts", "delta_ts", "lastdiffed", "deadline"), as.Date)
# keeping only the columns that are useful
bugs_df <- bugs_df %>%
  select("bug_id", "bug_severity", "bug_status", "creation_ts", "delta_ts", "op_sys", "priority", "resolution", "component_id", "version", "lastdiffed", "deadline")
#for quick view of the datatypes and the structure of data
skim(bugs_df)
| Name | bugs_df |
| Number of rows | 7042 |
| Number of columns | 12 |
| _______________________ | |
| Column type frequency: | |
| character | 6 |
| Date | 4 |
| numeric | 2 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| bug_severity | 0 | 1 | 5 | 11 | 0 | 7 | 0 |
| bug_status | 0 | 1 | 3 | 11 | 0 | 8 | 0 |
| op_sys | 0 | 1 | 3 | 15 | 0 | 22 | 0 |
| priority | 0 | 1 | 2 | 2 | 0 | 5 | 0 |
| resolution | 0 | 1 | 0 | 19 | 564 | 12 | 0 |
| version | 0 | 1 | 3 | 15 | 0 | 43 | 0 |
Variable type: Date
| skim_variable | n_missing | complete_rate | min | max | median | n_unique |
|---|---|---|---|---|---|---|
| creation_ts | 14 | 1 | 1998-08-07 | 2021-05-07 | 2009-12-08 | 4274 |
| delta_ts | 30 | 1 | 1998-08-09 | 2021-05-08 | 2012-07-20 | 3562 |
| lastdiffed | 14 | 1 | 1998-08-07 | 2021-05-08 | 2012-07-10 | 3565 |
| deadline | 7008 | 0 | 2010-04-23 | 2015-04-23 | 2013-11-09 | 30 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| bug_id | 0 | 1 | 10817.89 | 6189.36 | 1 | 5686.75 | 14101.5 | 16048.75 | 18097 | ▃▁▂▂▇ |
| component_id | 0 | 1 | 9.84 | 5.20 | 2 | 6.00 | 9.0 | 15.00 | 19 | ▇▇▆▃▆ |
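After the cleaning step, the summary reports missing values in the date columns that were complete while they were still character strings (for example 14 in creation_ts), which suggests that some timestamp strings could not be converted. A minimal sketch of how an explicit format makes that behaviour visible, using made-up timestamps rather than values from the dump:

```r
# Made-up timestamp strings; the last one is deliberately not a valid date.
raw_ts <- c("2021-05-07 13:45:10", "1998-08-07 00:00:00", "not a timestamp")

# With an explicit format, the date part is parsed, the trailing time is
# ignored, and anything that cannot be parsed becomes NA instead of an error.
parsed <- as.Date(raw_ts, format = "%Y-%m-%d")
print(parsed)

# Counting the NA values shows how many strings failed to convert.
print(sum(is.na(parsed)))
```

Comparing this count before and after the conversion is a quick way to see how many rows lose their date.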
#showing the `datatable`
datatable(head(bugs_df, 5), options = list(scrollX = TRUE))
About the Bugs Data used for Analysis
I've taken 12 columns into consideration to analyse the data. A brief description of the columns follows:
- bug_id: Unique numeric identifier for bug.
- bug_severity: How severe the bug is, e.g. enhancement, critical, etc.
- bug_status: Current status, e.g. NEW, RESOLVED, etc.
- creation_ts: When bug was filed.
- delta_ts: The timestamp of the last update on the bug. This includes updates to some related tables (e.g. “longdescs”).
- op_sys: Operating system bug was seen on, e.g. Windows Vista, Linux, etc.
- priority: The priority of the bug (P1 = most urgent, P5 = least urgent).
- resolution: The resolution, if the bug is in a closed state, e.g. FIXED, DUPLICATE, etc.
- component_id: Numeric ids of the components.
- version: Version of software in which bug is seen.
- lastdiffed: The time at which information about this bug changing was last emailed to the cc list.
- deadline: Date by which bug must be fixed.
Visualizations
bug_created <- bugs_df %>%
ggplot(aes(x = creation_ts, y = bug_id)) +
geom_line(color = "darkorchid4") +
labs(title = "Bug Creation",
subtitle = "The data frame is sent to the plot using pipes",
y = "Bug ID",
x = "Date") +
theme_bw(base_size = 15)
ggplotly(bug_created)
The time-series graph above shows the month and year in which each bug_id was filed, and the bar graph shows the year in which the most bugs were filed. Zooming in on the graphs shows the date on which each bug was filed. Most bugs were filed in January and July.
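A statement about which months see the most bug reports can be checked with a simple count alongside the plot. This is a minimal sketch on made-up dates, not on the R Bugzilla data:

```r
# Made-up creation dates standing in for bugs_df$creation_ts.
creation_dates <- as.Date(c("2020-01-15", "2020-01-20", "2020-07-04",
                            "2021-01-02", "2021-03-09"))

# Extract the month ("01" to "12") and count bugs per month.
month_counts <- table(format(creation_dates, "%m"))
print(month_counts)

# Month with the highest count.
busiest_month <- names(month_counts)[which.max(month_counts)]
print(busiest_month)
```

Applied to the real creation_ts column, the same count would give the per-month totals that the plot only shows visually.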
last_modified <- bugs_df %>%
ggplot(aes(x = lastdiffed, y = bug_id)) +
geom_line() +
labs(title = "Bug Last Modified",
y = "Bug ID",
x = "Date") +
theme_bw(base_size = 15)
ggplotly(last_modified)
The time-series graph above shows when each bug_id was last updated. Most bugs were last updated in January, March, April, and July. Most bugs were modified in the years 2014 to 2016, and most bugs were filed in 2019 to 2020.
# Plotting the Time Series graph with the bug_id and delta_ts
last_modified_graph <- bugs_df %>%
ggplot(aes(x = delta_ts, y = bug_id)) +
geom_point() +
labs(title = "Bug changing was last emailed to the cc list",
y = "Bug ID",
x = "Date") + theme_bw(base_size = 15)
ggplotly(last_modified_graph)
The time-series graph above shows when each bug_id was last updated. Most bugs were last updated in January, March, April, and July.
Resolution_graph <- ggplot(bugs_df,aes(x = resolution)) +
geom_bar() +
scale_x_discrete(guide = guide_axis(n.dodge = 5)) +
labs(
title = "Bug Resolution Bar graph with Bug Count",
x = "Resolution",
y = "Bug Count"
) + coord_flip()
ggplotly(Resolution_graph)
The resolution bar graph above shows how many bugs fall into each resolution category for bugs in a closed state, e.g. FIXED, DUPLICATE, etc. We can conclude that most bugs belong to the FIXED category of the resolution.
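The skim summary of the bugs table reports 564 empty strings in the resolution column, and these would appear in the bar graph as a bar without a label. A minimal sketch, on made-up values, of giving them an explicit label before counting or plotting (the label text is an arbitrary choice):

```r
# Made-up resolution values; "" stands for bugs without a resolution.
resolution <- c("FIXED", "", "DUPLICATE", "FIXED", "", "INVALID")

# Replace the empty string with an explicit label.
resolution_labelled <- ifelse(resolution == "", "(no resolution)", resolution)

# Count bugs per category, largest first.
resolution_counts <- sort(table(resolution_labelled), decreasing = TRUE)
print(resolution_counts)
```

The relabelled vector could be used as the x aesthetic in the bar graph so that unresolved bugs are shown as their own named category.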
Status_graph <- ggplot(bugs_df,aes(x = bug_status)) +
geom_bar() +
scale_x_discrete(guide = guide_axis(n.dodge = 5)) +
labs(
title = "Bug Status Bar graph with Bug Count",
x = "Bug Status",
y = "Bug Count"
)
ggplotly(Status_graph)
The bug_status bar graph above shows how many bugs fall into each bug_status category, e.g. NEW, RESOLVED, etc. We can conclude that most bugs belong to the CLOSED category of bug_status.
Severity_graph <- ggplot(bugs_df,aes(x = bug_severity)) +
geom_bar() +
scale_x_discrete(guide = guide_axis(n.dodge = 5)) +
labs(
title = "Bug Severity Bar graph with Bug Count",
x = "Bug Severity",
y = "Bug Count"
)
ggplotly(Severity_graph)
The bug_severity bar graph above shows how many bugs fall into each bug_severity category. Most of the filed bugs are normal; some are filed as enhancements (requests for features), minor, or major; and very few are filed under the blocker category.
Data Exploration of bugs and Attachments Table from the Database
bugs_attach_df <- tbl(con, "attachments")
# Converting `bugs_attach_df` to `dataframe`
bugs_attach_df <- as.data.frame(bugs_attach_df)
#for quick view of the datatypes and the structure of data
skim(bugs_attach_df)
| Name | bugs_attach_df |
| Number of rows | 1823 |
| Number of columns | 11 |
| _______________________ | |
| Column type frequency: | |
| character | 5 |
| numeric | 6 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| creation_ts | 0 | 1 | 19 | 19 | 0 | 1771 | 0 |
| modification_time | 0 | 1 | 19 | 19 | 0 | 1630 | 0 |
| description | 0 | 1 | 0 | 174 | 187 | 1485 | 0 |
| mimetype | 0 | 1 | 8 | 71 | 0 | 69 | 0 |
| filename | 0 | 1 | 3 | 70 | 0 | 1522 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| attach_id | 0 | 1 | 1876.50 | 572.77 | 1 | 1362.5 | 1895 | 2380.5 | 2838 | ▁▃▇▇▇ |
| bug_id | 0 | 1 | 15351.15 | 3661.05 | 1 | 15004.0 | 16413 | 17369.0 | 18097 | ▁▁▁▁▇ |
| ispatch | 0 | 1 | 0.43 | 0.49 | 0 | 0.0 | 0 | 1.0 | 1 | ▇▁▁▁▆ |
| submitter_id | 0 | 1 | 1313.33 | 1104.28 | 1 | 317.0 | 979 | 2143.0 | 3432 | ▇▆▂▃▃ |
| isobsolete | 0 | 1 | 0.12 | 0.32 | 0 | 0.0 | 0 | 0.0 | 1 | ▇▁▁▁▁ |
| isprivate | 0 | 1 | 0.00 | 0.00 | 0 | 0.0 | 0 | 0.0 | 0 | ▁▁▇▁▁ |
Cleaning attachments Data
bugs_attach_df <- bugs_attach_df %>%
mutate_at(vars("creation_ts", "modification_time"), as.Date) %>%
mutate_at(vars("isobsolete", "isprivate", "ispatch"), as.logical)
Joining the bugs and attachments tables
#joining the `attachments` and `bugs` table
baa <- merge(bugs_attach_df, bugs_df, by = intersect(names(bugs_attach_df), names(bugs_df)), all = TRUE)
# Created four columns `creation_month`, `creation_year` and `lastdiffed_month`, `lastdiffed_year` to find in which month and year a bug is created and modified respectively.
baa <- baa %>%
mutate(creation_month = format(creation_ts, "%m"),
creation_year = format(creation_ts, "%Y"),
lastdiffed_month = format(lastdiffed, "%m"),
lastdiffed_year = format(lastdiffed, "%Y")) %>%
group_by(creation_month, creation_year)
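Because the join above uses every column name the two tables have in common, it matches on creation_ts as well as bug_id, and an attachment created on a different day than its bug will not line up with that bug. The following is a sketch on made-up toy tables that compares this with a join on bug_id only; whether the second form is the better choice for this analysis is an assumption, not something the source states.

```r
# Toy stand-ins for the cleaned bugs and attachments tables (values are made up).
toy_bugs <- data.frame(
  bug_id      = c(1, 2),
  creation_ts = as.Date(c("2020-01-01", "2020-02-01")),
  priority    = c("P5", "P3")
)
toy_attachments <- data.frame(
  attach_id   = c(10, 11),
  bug_id      = c(1, 1),
  creation_ts = as.Date(c("2020-01-01", "2020-01-15"))
)

# Columns the two tables share, i.e. what `by = intersect(...)` uses.
shared <- intersect(names(toy_attachments), names(toy_bugs))
print(shared)

# Joining on every shared column: attachment 11 finds no matching bug,
# so its bug-level columns (here: priority) are NA.
by_all <- merge(toy_attachments, toy_bugs, by = shared, all = TRUE)
print(by_all)

# Joining on bug_id only, keeping both timestamps under distinct names.
by_id <- merge(toy_attachments, toy_bugs, by = "bug_id", all = TRUE,
               suffixes = c("_attachment", "_bug"))
print(by_id)
```

In the second result both attachments of bug 1 carry its priority, and the two creation timestamps stay distinguishable.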
#showing the `datatable`
datatable(head(baa, 5), options = list(scrollX = TRUE))
About the bugs and attachments Data Used for Analysis
I've taken 15 columns into consideration to analyse the data. A brief description of the columns follows:
- bug_id: Unique numeric identifier for bug.
- attach_id: Unique numeric identifier for attachment.
- creation_ts: When bug was filed.
- modification_time: The date and time on which the attachment was last modified.
- description: Text describing the attachment.
- mimetype: Content type of the attachment, like text/plain or image/png.
- ispatch: Whether attachment is a patch.
- filename: Path-less file name of attachment.
- submitter_id: Unique numeric identifier for who submitted the bug.
- isobsolete: Whether attachment is marked obsolete.
- isprivate: TRUE if the attachment should be private and FALSE if the attachment should be public.
- creation_month: The month in which the bug is created.
- creation_year: The year in which the bug is created.
- lastdiffed_month: The month in which the bug is last modified.
- lastdiffed_year: The year in which the bug is last modified.
Visualizations
# Counting the number of bugs per month in a year
bugs_counts <- baa %>%
  arrange(bug_id) %>%
  count(creation_year)
skim(head(bugs_counts))
| Name | head(bugs_counts) |
| Number of rows | 6 |
| Number of columns | 3 |
| _______________________ | |
| Column type frequency: | |
| numeric | 1 |
| ________________________ | |
| Group variables | creation_month, creation_year |
Variable type: numeric
| skim_variable | creation_month | creation_year | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| n | 01 | 1999 | 0 | 1 | 17 | NA | 17 | 17 | 17 | 17 | 17 | ▁▁▇▁▁ |
| n | 01 | 2000 | 0 | 1 | 13 | NA | 13 | 13 | 13 | 13 | 13 | ▁▁▇▁▁ |
| n | 01 | 2001 | 0 | 1 | 30 | NA | 30 | 30 | 30 | 30 | 30 | ▁▁▇▁▁ |
| n | 01 | 2002 | 0 | 1 | 41 | NA | 41 | 41 | 41 | 41 | 41 | ▁▁▇▁▁ |
| n | 01 | 2003 | 0 | 1 | 30 | NA | 30 | 30 | 30 | 30 | 30 | ▁▁▇▁▁ |
| n | 01 | 2004 | 0 | 1 | 21 | NA | 21 | 21 | 21 | 21 | 21 | ▁▁▇▁▁ |
Note: Only the first 6 rows of bugs_counts are summarized here, since the summary of the whole data is very large.
# 3D plot to see the number of bug counts per month in a year
bug_count_graph <- plot_ly(
x = bugs_counts$creation_month,
y = bugs_counts$creation_year,
z = bugs_counts$n,
type="scatter3d",
mode="markers", marker = list(size=2))
bug_count_graph <- bug_count_graph %>%
layout(
title = "Bug Counts per month in a year"
)
bug_count_graph
## Warning: Ignoring 1 observations
The visualization above shows the number of bugs per month in each year. The highest bug count is 77, in April 2015, and the lowest bug count is 2.
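The highest and lowest counts quoted above can be read off a 3D plot only by hovering over points; they can also be looked up directly in the count table. A minimal sketch on made-up rows (not the real bugs_counts), using base R:

```r
# Made-up bug records with a creation year and month each.
toy <- data.frame(
  creation_year  = c("2015", "2015", "2015", "2015", "2015", "2016"),
  creation_month = c("04", "04", "04", "05", "05", "01"),
  stringsAsFactors = FALSE
)
toy$n <- 1L

# Number of bugs per month in each year.
counts <- aggregate(n ~ creation_year + creation_month, data = toy, FUN = sum)

# Rows with the highest and the lowest count.
busiest  <- counts[which.max(counts$n), ]
quietest <- counts[which.min(counts$n), ]
print(busiest)
print(quietest)
```

Note that `which.max()` and `which.min()` return only the first match, so ties between months would need a filter such as `counts[counts$n == max(counts$n), ]`.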
#filtering the data where resolution is Duplicate
res_dupli <- baa %>%
  filter(resolution == "DUPLICATE" & bug_status == "CLOSED")
# plotting graph with creation month where resolution is Duplicate
dupli_month_graph <- ggplot(res_dupli) +
geom_bar(aes(x = creation_month)) +
labs(
title = "Months in which Duplicate Bugs are Filed",
x = "Months",
y = "Bug_Count"
)
ggplotly(dupli_month_graph)
The visualization above shows the months in which bugs with the resolution DUPLICATE were filed. From the graph we can see that the most were filed in August, with a bug count of 12, and the fewest were filed in December, with a bug count of 2. This graph covers the years 2006 to 2021.
# plotting graph with creation year where resolution is Duplicate
duplicate_year <- ggplot(res_dupli) +
geom_bar(aes(x = creation_year)) +
labs(
title = "Year in which Duplicate Bugs are Filed",
x = "Year",
y = "Bug_Count"
)
ggplotly(duplicate_year)
The visualization above shows the years in which bugs with the resolution DUPLICATE were filed. From the graph we can see that the most were filed in 2012, with a bug count of 11, and the fewest were filed in 2007, 2008 and 2019, each with a bug count of 1. This graph covers the years 2006 to 2021.
#filtering the data where resolution is Fixed
res_fixed <- baa %>%
  filter(resolution == "FIXED" & bug_status == "CLOSED")
# plotting graph with last modified year where resolution is Fixed
fixed_year_graph <- ggplot(res_fixed) +
geom_bar(aes(x = lastdiffed_year)) +
labs(
title = "Year in which fixed bugs are last modified",
x = "Year",
y = "Bug_Count"
) +
coord_flip()
ggplotly(fixed_year_graph)
The visualization above shows the years in which bugs with the resolution FIXED and the status CLOSED were last modified. From the graph we can see that the most were last modified in 2002, with a bug count of 328, and that 47 bugs were fixed and closed in 2021.
# plotting graph with last modified month where resolution is Fixed
fixed_month_graph <- ggplot(res_fixed) +
geom_bar(aes(x = lastdiffed_month)) +
labs(
title = "Month in which fixed bugs are last modified",
x = "Month ",
y = "Bug_Count"
)
ggplotly(fixed_month_graph)
The visualization above shows the months in which bugs with the resolution FIXED were last modified. From the graph we can see that the most were last modified in December, with a bug count of 559, and the fewest in September, with a bug count of 228. This graph covers the years 1998 to 2021.
res_invalid <- baa %>%
filter(resolution == "INVALID" & bug_status == "CLOSED")
invalid_graph <- ggplot(res_invalid) +
geom_bar(aes(x = creation_month)) +
labs(
title = "Month in which invalid bugs are filed",
x = "Month ",
y = "Bug_Count"
)
ggplotly(invalid_graph)
The visualization above shows the months in which bugs with the resolution INVALID and the status CLOSED were filed. From the graph we can see that the most were created in October, with a bug count of 625, while 491 were created in February. This graph covers the years 1998 to 2021.
invalid_year_graph <- ggplot(res_invalid) +
  geom_bar(aes(x = creation_year)) +
  labs(
    title = "Year in which INVALID Bugs are Filed",
    x = "Year",
    y = "Bug_Count"
  ) + coord_flip()
ggplotly(invalid_year_graph)
This visualization shows when the INVALID bugs were created. In 1998 a total of 63 INVALID bugs were created, the fewest of any year, and in 2013 a total of 431 were filed, the most.
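The DUPLICATE, FIXED and INVALID blocks above repeat the same two steps: keep the closed bugs with one resolution, then count them by a month or year column. As a sketch, the steps could be wrapped in one small function; `count_closed_by()` is a hypothetical helper name and the data below is made up.

```r
# Hypothetical helper: count CLOSED bugs with a given resolution by one column.
count_closed_by <- function(data, resolution_value, column) {
  keep <- data$resolution == resolution_value & data$bug_status == "CLOSED"
  keep[is.na(keep)] <- FALSE  # rows with missing values are not counted
  table(data[[column]][keep])
}

# Made-up rows standing in for the joined table.
toy <- data.frame(
  resolution     = c("DUPLICATE", "DUPLICATE", "FIXED", "INVALID", "DUPLICATE"),
  bug_status     = c("CLOSED", "CLOSED", "CLOSED", "CLOSED", "NEW"),
  creation_month = c("08", "08", "12", "10", "08"),
  stringsAsFactors = FALSE
)

duplicate_counts <- count_closed_by(toy, "DUPLICATE", "creation_month")
print(duplicate_counts)
```

The counts returned this way are the numbers that the bar heights in the corresponding graphs represent, which makes the quoted maxima and minima easier to verify.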
priority_graph <- baa %>%
ggplot(aes(x = creation_year, y = bug_id)) +
geom_point() +
facet_wrap( ~priority) +
labs(title = "Bugs created year with their priorities",
y = "Bug ID",
x = "Date") + theme_bw(base_size = 9) +
coord_flip()
ggplotly(priority_graph)
The visualization above shows when bugs were created and which priority they fall under. From the plot we can conclude that the majority of bugs are filed under P5, the lowest priority.
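A scatter of bug IDs per priority panel makes it hard to judge which priority holds the majority, because overlapping points hide the counts. A minimal sketch, on made-up priorities, of the share per priority that would back up such a statement:

```r
# Made-up priority values standing in for baa$priority.
priority <- c("P5", "P5", "P3", "P5", "P1")

# Share of bugs per priority level.
priority_share <- prop.table(table(priority))
print(round(priority_share, 2))

# Priority level with the largest share.
top_priority <- names(priority_share)[which.max(priority_share)]
print(top_priority)
```

A share above 0.5 for one level is what "the majority" means in this context.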
Data Exploration of bugs_mod Table from the Database
bugs_mod_df <- tbl(con, "bugs_mod")
# Converting `bugs_mod_df` to `dataframe`
bugs_mod_df <- as.data.frame(bugs_mod_df)
#for quick view of the datatypes and the structure of data
skim(bugs_mod_df)
| Name | bugs_mod_df |
| Number of rows | 7042 |
| Number of columns | 28 |
| _______________________ | |
| Column type frequency: | |
| character | 18 |
| numeric | 10 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| row_names | 0 | 1.00 | 1 | 4 | 0 | 7042 | 0 |
| bug_file_loc | 7042 | 0.00 | NA | NA | 0 | 0 | 0 |
| bug_severity | 0 | 1.00 | 5 | 11 | 0 | 7 | 0 |
| bug_status | 0 | 1.00 | 3 | 11 | 0 | 8 | 0 |
| creation_ts | 14 | 1.00 | 10 | 10 | 0 | 4274 | 0 |
| delta_ts | 7042 | 0.00 | NA | NA | 0 | 0 | 0 |
| short_desc | 0 | 1.00 | 1 | 255 | 0 | 6923 | 0 |
| op_sys | 7042 | 0.00 | NA | NA | 0 | 0 | 0 |
| priority | 0 | 1.00 | 2 | 2 | 0 | 5 | 0 |
| rep_platform | 0 | 1.00 | 3 | 25 | 0 | 7 | 0 |
| version | 0 | 1.00 | 3 | 15 | 0 | 43 | 0 |
| resolution | 564 | 0.92 | 4 | 19 | 0 | 11 | 0 |
| target_milestone | 0 | 1.00 | 3 | 3 | 0 | 1 | 0 |
| status_whiteboard | 7042 | 0.00 | NA | NA | 0 | 0 | 0 |
| lastdiffed | 7042 | 0.00 | NA | NA | 0 | 0 | 0 |
| estimated_time | 0 | 1.00 | 4 | 6 | 0 | 19 | 0 |
| remaining_time | 0 | 1.00 | 4 | 4 | 0 | 1 | 0 |
| deadline | 7008 | 0.00 | 10 | 10 | 0 | 30 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| bug_id | 0 | 1 | 10817.89 | 6189.36 | 1 | 5686.75 | 14101.5 | 16048.75 | 18097 | ▃▁▂▂▇ |
| assigned_to | 0 | 1 | 17.48 | 120.26 | 1 | 2.00 | 5.0 | 16.00 | 2787 | ▇▁▁▁▁ |
| product_id | 0 | 1 | 2.00 | 0.00 | 2 | 2.00 | 2.0 | 2.00 | 2 | ▁▁▇▁▁ |
| reporter | 0 | 1 | 685.69 | 1003.34 | 1 | 2.00 | 2.0 | 1056.00 | 3432 | ▇▂▁▁▁ |
| component_id | 0 | 1 | 9.84 | 5.20 | 2 | 6.00 | 9.0 | 15.00 | 19 | ▇▇▆▃▆ |
| qa_contact | 7042 | 0 | NaN | NA | NA | NA | NA | NA | NA | |
| votes | 0 | 1 | 0.00 | 0.00 | 0 | 0.00 | 0.0 | 0.00 | 0 | ▁▁▇▁▁ |
| everconfirmed | 0 | 1 | 0.83 | 0.38 | 0 | 1.00 | 1.0 | 1.00 | 1 | ▂▁▁▁▇ |
| reporter_accessible | 0 | 1 | 1.00 | 0.00 | 1 | 1.00 | 1.0 | 1.00 | 1 | ▁▁▇▁▁ |
| cclist_accessible | 0 | 1 | 1.00 | 0.00 | 1 | 1.00 | 1.0 | 1.00 | 1 | ▁▁▇▁▁ |
# showing the `bugs_mod_df` table in the `datatable`
datatable(head(bugs_mod_df, 5), options = list(scrollX = TRUE))
Data Exploration of longdescs Table from the Database
longdescs_df <- tbl(con, "longdescs")
## Warning in .local(conn, statement, ...): Decimal MySQL column 4 imported as
## numeric
# Converting `longdescs_df` to `dataframe`
longdescs_df <- as.data.frame(longdescs_df)
## Warning in .local(conn, statement, ...): Decimal MySQL column 4 imported as
## numeric
#for quick view of the datatypes and the structure of data
skim(longdescs_df)
| Name | longdescs_df |
| Number of rows | 26942 |
| Number of columns | 11 |
| _______________________ | |
| Column type frequency: | |
| character | 3 |
| numeric | 8 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| bug_when | 0 | 1.00 | 19 | 19 | 0 | 26270 | 0 |
| thetext | 0 | 1.00 | 0 | 422285 | 772 | 25588 | 0 |
| extra_data | 24966 | 0.07 | 1 | 5 | 0 | 1948 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| comment_id | 0 | 1 | 83378.70 | 7986.99 | 1 | 76528.25 | 83263.5 | 90215.75 | 97284 | ▁▁▁▃▇ |
| bug_id | 0 | 1 | 10479.44 | 6260.77 | 1 | 4195.00 | 13361.0 | 16072.00 | 18097 | ▅▁▃▂▇ |
| who | 0 | 1 | 457.47 | 896.85 | 1 | 2.00 | 2.0 | 412.00 | 3432 | ▇▁▁▁▁ |
| work_time | 0 | 1 | 0.00 | 0.04 | 0 | 0.00 | 0.0 | 0.00 | 5 | ▇▁▁▁▁ |
| isprivate | 0 | 1 | 0.00 | 0.00 | 0 | 0.00 | 0.0 | 0.00 | 0 | ▁▁▇▁▁ |
| already_wrapped | 0 | 1 | 0.00 | 0.00 | 0 | 0.00 | 0.0 | 0.00 | 0 | ▁▁▇▁▁ |
| type | 0 | 1 | 0.35 | 1.26 | 0 | 0.00 | 0.0 | 0.00 | 6 | ▇▁▁▁▁ |
| is_markdown | 0 | 1 | 0.04 | 0.20 | 0 | 0.00 | 0.0 | 0.00 | 1 | ▇▁▁▁▁ |
# showing the `longdescs_df` table in the `datatable`
datatable(head(longdescs_df, 5), options = list(scrollX = TRUE))
Data Exploration of bugs_activity Table from the Database
bugs_act_df <- tbl(con, "bugs_activity")
# Converting `bugs_act_df` to `dataframe`
bugs_act_df <- as.data.frame(bugs_act_df)
#for quick view of the datatypes and the structure of data
skim(bugs_act_df)
| Name | bugs_act_df |
| Number of rows | 15114 |
| Number of columns | 9 |
| _______________________ | |
| Column type frequency: | |
| character | 3 |
| numeric | 6 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| bug_when | 0 | 1 | 19 | 19 | 0 | 7290 | 0 |
| added | 0 | 1 | 0 | 133 | 142 | 400 | 0 |
| removed | 0 | 1 | 0 | 122 | 7656 | 374 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| bug_id | 0 | 1.00 | 15286.06 | 2884.99 | 1 | 14762.25 | 15738.0 | 16894.00 | 18097 | ▁▁▁▁▇ |
| attach_id | 14880 | 0.02 | 2053.50 | 509.98 | 1 | 1711.25 | 2128.0 | 2420.50 | 2833 | ▁▁▅▇▇ |
| who | 0 | 1.00 | 462.89 | 892.91 | 1 | 6.00 | 18.0 | 308.00 | 3432 | ▇▁▁▁▁ |
| fieldid | 0 | 1.00 | 14.02 | 6.83 | 2 | 9.00 | 12.0 | 20.00 | 54 | ▇▃▁▁▁ |
| comment_id | 15086 | 0.00 | 92539.75 | 1993.95 | 89370 | 90952.75 | 91992.5 | 93529.25 | 96991 | ▆▇▅▂▂ |
| id | 0 | 1.00 | 8566.00 | 5187.40 | 1 | 3791.25 | 8723.0 | 13347.75 | 17143 | ▇▆▇▆▇ |
Joining all the data tables
#joining all the data tables
total_data <- merge(bugs_df, bugs_act_df, by = intersect(names(bugs_df),
names(bugs_act_df)), all = TRUE) %>%
merge(., bugs_attach_df, by = intersect(names(.),
names(bugs_attach_df)), all = TRUE) %>%
merge(., bugs_mod_df, by = intersect(names(.),
names(bugs_mod_df)), all = TRUE)
# creating a creation_year column
total_data$creation_year <- as.Date(cut(total_data$creation_ts,
breaks = "year"))
# creating a creation_month column
total_data$creation_month <- as.Date(cut(total_data$creation_ts,
breaks = "month"))
# creating a creation_week column
total_data$creation_week <- as.Date(cut(total_data$creation_ts,
breaks = "week", start.on.monday = FALSE))
# creating a lastdiffed_year column
total_data$lastdiffed_year <- as.Date(cut(total_data$lastdiffed,
breaks = "year"))
# creating a lastdiffed_month column
total_data$lastdiffed_month <- as.Date(cut(total_data$lastdiffed,
breaks = "month"))
# creating a lastdiffed_week column
total_data$lastdiffed_week <- as.Date(cut(total_data$lastdiffed,
breaks = "week", start.on.monday = FALSE))
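The `as.Date(cut(...))` calls above round each date down to the first day of its year, month or week, with weeks starting on Sunday because of `start.on.monday = FALSE`. A minimal sketch on two made-up dates shows what each column will contain:

```r
# Two made-up dates: a Friday in May 2021 and a Thursday in December 2020.
dates <- as.Date(c("2021-05-07", "2020-12-31"))

# Round each date down to the start of its year, month and (Sunday-based) week.
year_start  <- as.Date(cut(dates, breaks = "year"))
month_start <- as.Date(cut(dates, breaks = "month"))
week_start  <- as.Date(cut(dates, breaks = "week", start.on.monday = FALSE))

print(data.frame(dates, year_start, month_start, week_start))
```

This matches the summary of the joined data, where the latest creation_ts of 2021-05-07 corresponds to a latest creation_week of 2021-05-02.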
# selecting required columns for the analysis
total_data <- total_data %>%
select("bug_id", "creation_ts", "bug_severity", "bug_status", "delta_ts", "op_sys", "priority", "resolution",
"component_id", "version", "lastdiffed", "deadline", "attach_id", "who", "bug_when", "fieldid", "added",
"removed", "modification_time", "description", "mimetype", "ispatch", "filename", "submitter_id",
"isobsolete", "isprivate", "assigned_to", "product_id", "reporter", "creation_year", "creation_month",
"creation_week", "lastdiffed_year", "lastdiffed_month", "lastdiffed_week")
#counting total_creation_bug_count
total_creation_bug_count <- total_data %>%
arrange(creation_year) %>%
count(bug_id)
# joining total_data and total_creation_bug_count
total_data <- merge(total_creation_bug_count, total_data, by = intersect(names(total_creation_bug_count),
names(total_data)), all = TRUE)
#for quick view of the datatypes and the structure of data
skim(total_data)
| Name | total_data |
| Number of rows | 27199 |
| Number of columns | 36 |
| _______________________ | |
| Column type frequency: | |
| character | 12 |
| Date | 11 |
| logical | 3 |
| numeric | 10 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| bug_severity | 1697 | 0.94 | 5 | 11 | 0 | 7 | 0 |
| bug_status | 1697 | 0.94 | 3 | 11 | 0 | 8 | 0 |
| op_sys | 8739 | 0.68 | 3 | 15 | 0 | 22 | 0 |
| priority | 1697 | 0.94 | 2 | 2 | 0 | 5 | 0 |
| resolution | 2261 | 0.92 | 0 | 19 | 1251 | 12 | 0 |
| version | 1697 | 0.94 | 3 | 15 | 0 | 43 | 0 |
| bug_when | 12085 | 0.56 | 19 | 19 | 0 | 7290 | 0 |
| added | 12085 | 0.56 | 0 | 133 | 142 | 400 | 0 |
| removed | 12085 | 0.56 | 0 | 122 | 7656 | 374 | 0 |
| description | 25373 | 0.07 | 0 | 174 | 187 | 1485 | 0 |
| mimetype | 25373 | 0.07 | 8 | 71 | 0 | 69 | 0 |
| filename | 25373 | 0.07 | 3 | 70 | 0 | 1522 | 0 |
Variable type: Date
| skim_variable | n_missing | complete_rate | min | max | median | n_unique |
|---|---|---|---|---|---|---|
| creation_ts | 30 | 1.00 | 1998-08-07 | 2021-05-07 | 2012-12-31 | 4443 |
| delta_ts | 8770 | 0.68 | 1998-08-09 | 2021-05-08 | 2015-12-14 | 3562 |
| lastdiffed | 8753 | 0.68 | 1998-08-07 | 2021-05-08 | 2015-12-14 | 3565 |
| deadline | 27026 | 0.01 | 2010-04-23 | 2015-04-23 | 2013-11-08 | 30 |
| modification_time | 25375 | 0.07 | 1998-12-04 | 2021-05-07 | 2015-08-23 | 1093 |
| creation_year | 30 | 1.00 | 1998-01-01 | 2021-01-01 | 2012-01-01 | 24 |
| creation_month | 30 | 1.00 | 1998-08-01 | 2021-05-01 | 2012-12-01 | 274 |
| creation_week | 30 | 1.00 | 1998-08-02 | 2021-05-02 | 2012-12-30 | 1166 |
| lastdiffed_year | 8753 | 0.68 | 1998-01-01 | 2021-01-01 | 2015-01-01 | 24 |
| lastdiffed_month | 8753 | 0.68 | 1998-08-01 | 2021-05-01 | 2015-12-01 | 272 |
| lastdiffed_week | 8753 | 0.68 | 1998-08-02 | 2021-05-02 | 2015-12-13 | 1127 |
Variable type: logical
| skim_variable | n_missing | complete_rate | mean | count |
|---|---|---|---|---|
| ispatch | 25373 | 0.07 | 0.43 | FAL: 1045, TRU: 781 |
| isobsolete | 25373 | 0.07 | 0.12 | FAL: 1615, TRU: 211 |
| isprivate | 25373 | 0.07 | 0.00 | FAL: 1826 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| bug_id | 0 | 1.00 | 12992.23 | 5390.83 | 1 | 10414.0 | 15152.0 | 16615.5 | 18097 | ▂▁▁▁▇ |
| n | 0 | 1.00 | 5.65 | 4.40 | 2 | 3.0 | 5.0 | 7.0 | 51 | ▇▁▁▁▁ |
| component_id | 1697 | 0.94 | 9.82 | 5.30 | 2 | 6.0 | 9.0 | 15.0 | 19 | ▇▇▆▃▇ |
| attach_id | 25268 | 0.07 | 1887.74 | 569.83 | 1 | 1378.5 | 1920.0 | 2380.5 | 2838 | ▁▃▇▇▇ |
| who | 12085 | 0.56 | 462.89 | 892.91 | 1 | 6.0 | 18.0 | 308.0 | 3432 | ▇▁▁▁▁ |
| fieldid | 12085 | 0.56 | 14.02 | 6.83 | 2 | 9.0 | 12.0 | 20.0 | 54 | ▇▃▁▁▁ |
| submitter_id | 25373 | 0.07 | 1314.49 | 1104.91 | 1 | 317.0 | 983.5 | 2148.0 | 3432 | ▇▆▂▃▃ |
| assigned_to | 20157 | 0.26 | 17.48 | 120.26 | 1 | 2.0 | 5.0 | 16.0 | 2787 | ▇▁▁▁▁ |
| product_id | 20157 | 0.26 | 2.00 | 0.00 | 2 | 2.0 | 2.0 | 2.0 | 2 | ▁▁▇▁▁ |
| reporter | 20157 | 0.26 | 685.69 | 1003.34 | 1 | 2.0 | 2.0 | 1056.0 | 3432 | ▇▂▁▁▁ |
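The year, month and week columns created above rely on `cut()` for dates, which maps each date to the first day of its period. A minimal sketch on made-up dates may make that concrete; the `ts` vector and the `*_start` names are hypothetical and are not part of the R Bugzilla data.

```r
# Toy timestamps (hypothetical values, not taken from the R Bugzilla dump)
ts <- as.Date(c("2015-12-14", "2015-12-31", "2016-01-03"))

# cut() returns a factor whose labels are the period start dates;
# as.Date() converts those labels back into Date values
year_start  <- as.Date(cut(ts, breaks = "year"))
month_start <- as.Date(cut(ts, breaks = "month"))
# start.on.monday = FALSE makes weeks start on Sunday
week_start  <- as.Date(cut(ts, breaks = "week", start.on.monday = FALSE))

print(data.frame(ts, year_start, month_start, week_start))
```

With `start.on.monday = FALSE`, every date is mapped to the Sunday on or before it, which matches the Sunday dates reported for creation_week and lastdiffed_week in the summary above.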
datatable(head(total_data, 13), options = list(scrollX = TRUE))
Visualizations
cre_last_year_graph <- total_data %>%
ggplot(aes(y = n)) +
geom_line(aes(x = creation_year)) +
geom_line(aes(x = lastdiffed_year), color="steelblue", linetype="twodash") +
labs(
title = "Year in which bugs are Created vs Last modified",
x = "year",
y = "Bug_Count"
)
ggplotly(cre_last_year_graph)
In 2015 and 2016, a total of 51 bugs were created; in 2020 a total of 51 bugs were modified, and in 2015, 31 bugs were modified. From 2013 to 2016 there is a peak in both the creation and the modification of bugs, and from 2003 to 2009 there is an almost linear fall and rise in the creation and modification of bugs.
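The n column on the y-axis of the plot above is produced by `count(bug_id)` followed by `merge()`. The sketch below uses made-up toy data to show what that column holds: `count(bug_id)` returns the number of rows per bug_id. For comparison only, it also shows one possible way to count distinct bugs per creation year with `n_distinct()`. The names `toy`, `rows_per_bug`, `toy_n` and `bugs_per_year` are hypothetical, and this is an illustration under those assumptions, not the analysis used in this report.

```r
library(dplyr)

# Toy data (hypothetical): bug 1 has three rows, bug 2 has two, bug 3 has one
toy <- data.frame(
  bug_id = c(1, 1, 1, 2, 2, 3),
  creation_year = as.Date(c("2015-01-01", "2015-01-01", "2015-01-01",
                            "2015-01-01", "2015-01-01", "2016-01-01"))
)

# Same pattern as in the report: count rows per bug_id ...
rows_per_bug <- toy %>% count(bug_id)

# ... and merge the counts back, so every row carries its bug's row count n
toy_n <- merge(rows_per_bug, toy, by = "bug_id", all = TRUE)

# Assumed alternative: number of distinct bugs per creation year
bugs_per_year <- toy %>%
  group_by(creation_year) %>%
  summarise(bug_count = n_distinct(bug_id))

print(rows_per_bug)
print(bugs_per_year)
```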
cre_last_month_graph <- total_data %>%
ggplot(aes(y = n)) +
geom_line(aes(x = creation_month)) +
geom_line(aes(x = lastdiffed_month), color="steelblue", linetype="twodash") +
labs(
title = "Month in which bugs are Created vs Last modified",
x = "Month",
y = "Bug_Count"
)
ggplotly(cre_last_month_graph)
In December 2015 and January 2016, a total of 51 bugs were created, which is a sudden increase. From September to October 2016 and from October to November 2016 there is a linear increase in the creation of bugs. In the last-modified graph, a total of 51 bugs were last modified in November 2020. Until 2010, an average of about 2 bugs were modified in most months, but from 2011 onward the modification of bugs increases to an average of about 9 bugs per month.
cre_last_week_graph <- total_data %>%
ggplot(aes(y = n)) +
geom_line(aes(x = creation_week)) +
geom_line(aes(x = lastdiffed_week), color="steelblue", linetype="twodash") +
labs(
title = "Week in which bugs are Created vs Last modified",
x = "Week",
y = "Bug_Count"
)
ggplotly(cre_last_week_graph)
In December 2015 and January 2016, a total of 51 bugs were created per week, which is a sudden increase. From September to October 2016 and from October to November 2016 there is a linear increase in the creation of bugs. In the last-modified graph, a total of 51 bugs were last modified in November 2020. Until 2010, an average of 2 or 3 bugs were modified in most weeks, but from 2011 onward the modification of bugs increases to an average of 10 to 12 bugs each week.
Conclusion
In this project, I visualized the bugRzilla database with a series of plots.
- The first graph is a line graph of bug_id against creation_ts (date) and shows when each bug was created. The time-series graph shows in which month and year each bug_id was filed, and the bar graph shows in which year the most bugs were filed. Zooming into the graphs shows the date on which each bug was filed. Most bugs were filed in January and July.
- The next graph plots bug_id against last_modified. The time-series graph shows when each bug_id was last updated. Most bugs were last updated in January, March, April, and July. Most bugs were modified from 2014 to 2016, and most bugs were filed from 2019 to 2020.
- The next graph plots bug_id against delta_ts. The time-series graph shows when each bug_id was last updated. Most bugs were last updated in January, March, April, and July.
- The bar graph of bug_count against resolution shows which resolution category each bug_id belongs to when the bug is in a closed state, e.g. FIXED, DUPLICATE, etc. Most bugs belong to the FIXED resolution category.
- The next graph plots bug_id against bug_status. The bar graph shows which bug_status category each bug_id belongs to, e.g. NEW, RESOLVED, etc. Most bugs belong to the CLOSED category.
- The next graph plots bug_id against bug_severity. The bar graph shows which bug_severity category each bug_id belongs to. Most filed bugs are normal; some are filed under enhancement (these are retested for some features), minor, and major; and very few are filed under the blocker category.
- The next graph plots creation_month, bug_count and creation_year and shows the number of bugs per month in each year. The highest bug count is 77, in April 2015, and the lowest bug count is 2.
- The next graph plots bug_count against creation_month and shows the month in which bugs with resolution Duplicate were filed. The most were filed in August, with a bug count of 12, and the fewest in December, with a bug count of 2. The graph covers 2006 to 2021.
- The next graph plots bug_count against creation_year and shows the year in which bugs with resolution Duplicate were filed. The most were filed in 2012, with a bug count of 11, and the fewest in 2007, 2008 and 2019, each with a bug count of 1. The graph covers 2006 to 2021.
- The next graph plots bug_count against lastdiffed_year and shows the year in which bugs with resolution Fixed and status closed were last modified. The most were last modified in 2002, with a bug count of 328, and in 2021, 47 bugs were fixed and closed.
- The next graph plots bug_count against lastdiffed_month and shows the month in which bugs with resolution Fixed were last modified. The most were last modified in December, with a bug count of 559, and the fewest in September, with a bug count of 228. The graph covers 1998 to 2021.
- The next graph plots bug_count against creation_month and shows the month in which bugs with resolution Invalid and status closed were created. The most were created in October, with a bug count of 625, while February has a bug count of 491. The graph covers 1998 to 2021.
- The next graph plots bug_count against creation_year and refers to the creation of Invalid bugs. In 1998 a total of 63 Invalid bugs were created, which is the fewest, and in 2013 a total of 431 were filed, which is the most.
- The next graph plots creation_year against bug_id and shows when bugs were created and under which priority they fall. The majority of bugs are filed under P5, which is the lowest priority.
- The next graph plots bug_count, creation_year and lastdiffed_year. In 2015 and 2016, a total of 51 bugs were created; in 2020 a total of 51 bugs were modified, and in 2015, 31 bugs were modified. From 2013 to 2016 there is a peak in both the creation and the modification of bugs, and from 2003 to 2009 there is an almost linear fall and rise in the creation and modification of bugs.
- In December 2015 and January 2016, a total of 51 bugs were created, which is a sudden increase. From September to October 2016 and from October to November 2016 there is a linear increase in the creation of bugs. In the last-modified graph, a total of 51 bugs were last modified in November 2020. Until 2010, an average of about 2 bugs were modified in most months, but from 2011 onward the modification of bugs increases to an average of about 9 bugs per month.
- The next graph plots bug_count, creation_week and lastdiffed_week. In December 2015 and January 2016, a total of 51 bugs were created per week, which is a sudden increase. From September to October 2016 and from October to November 2016 there is a linear increase in the creation of bugs. In the last-modified graph, a total of 51 bugs were last modified in November 2020. Until 2010, an average of 2 or 3 bugs were modified in most weeks, but from 2011 onward the modification of bugs increases to an average of 10 to 12 bugs each week.
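Several points in the list above name the calendar months with the most bugs. As a rough sketch of how such a statement can be checked, the snippet below tabulates made-up creation dates by month of the year using base R only. The `created`, `month_counts` and `busiest` names are hypothetical and the dates are not taken from the R Bugzilla data.

```r
# Toy creation dates (hypothetical values)
created <- as.Date(c("2015-01-10", "2016-01-22", "2016-07-04", "2017-03-15"))

# "%m" extracts the two-digit month number, independent of the locale
month_counts <- table(format(created, "%m"))

# Month(s) with the highest count
busiest <- names(month_counts)[as.vector(month_counts) == max(month_counts)]

print(month_counts)
print(busiest)
```

Using the month number instead of the month name keeps the result independent of the system locale.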
dbDisconnect(con)
## [1] TRUE